WHAT IS CLAIMED IS: 



1. A method of recognizing a document image 
including a plurality of areas, comprising the steps of: 

a) inputting said document image as a digital image;

b) specifying a background color of said
document image;

c) extracting a plurality of pixels located in 
areas other than a background area from said document 
image by use of said background color; 

d) creating a plurality of connected elements 
by combining said plurality of pixels; and 

e) classifying said plurality of connected 
elements into a plurality of fixed types of areas by 
using at least features of shapes of said plurality of 
connected elements to obtain an area-separated document 
image.
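
Steps (d) and (e) can be illustrated with a minimal sketch. It assumes the extracted pixels are given as a 2-D mask of truthy cells, that pixels are combined by 4-connectivity, and that the only shape features used are the height and width of each bounding box. The function names, the type labels and the thresholds `char_max` and `line_ratio` are hypothetical and are not taken from the claims.

```python
from collections import deque


def connected_elements(mask):
    """Group foreground pixels (truthy cells) into 4-connected elements.

    Returns one bounding box (top, left, bottom, right), inclusive,
    per connected element, in raster-scan order of discovery.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def classify_by_shape(box, char_max=3, line_ratio=5):
    """Toy shape rule; both thresholds are illustrative assumptions."""
    top, left, bottom, right = box
    h = bottom - top + 1
    w = right - left + 1
    if max(h, w) >= line_ratio * min(h, w):
        return "ruled line"      # long and thin
    if h <= char_max and w <= char_max:
        return "character"       # small and compact
    return "figure or photograph"
```

A practical classifier would likely use more features (pixel density, neighbouring elements, color), which the wording "at least features of shapes" leaves open.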

2. The method as claimed in claim 1, further
comprising the steps of: 

f) creating a binary image by binarizing said 
area-separated document image; 

g) classifying a plurality of areas included 
in said binary image into said plurality of fixed types 
of areas; 

h) comparing a result of the step (e) and a
result of the step (g);

i) correcting said area-separated document
image if said result of the step (e) is not equal to
said result of the step (g); and

j) recognizing a character in a text area of 
said area-separated document image. 



3. The method as claimed in claim 1, wherein 
said step (b) includes the steps of: 

k) clustering a plurality of colors on said 
document image; and 

l) setting a representative color of a largest
cluster obtained by the step (k) to said background
color.

4. The method as claimed in claim 3, wherein
said step (k) includes the steps of: 

sampling each of the plurality of pixels at 
regular intervals; and 

clustering the plurality of colors on said
document image by use of a plurality of pixel values
obtained by smoothing pixels surrounding said each of 
the plurality of pixels. 
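
The background-color estimation of claims 3 and 4 might be sketched as follows. The claims do not name a clustering algorithm, so this sketch substitutes a coarse per-channel quantization for the clustering step, and it assumes a 3x3 averaging window for the smoothing. The name `estimate_background` and the parameters `step` and `bin_size` are hypothetical.

```python
from collections import defaultdict


def estimate_background(image, step=4, bin_size=32):
    """Estimate the background color of an RGB image.

    image: list of rows, each row a list of (r, g, b) tuples (0-255).
    Samples every `step`-th pixel, smooths it with its neighbours,
    groups the smoothed colors, and returns the mean color of the
    largest group.
    """
    height, width = len(image), len(image[0])
    clusters = defaultdict(list)
    for y in range(0, height, step):
        for x in range(0, width, step):
            # Smooth: average the sample with its in-bounds 3x3 neighbours.
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            n = len(neighbours)
            smoothed = tuple(sum(p[c] for p in neighbours) / n
                             for c in range(3))
            # Assumption: coarse quantization stands in for clustering.
            key = tuple(int(v) // bin_size for v in smoothed)
            clusters[key].append(smoothed)
    largest = max(clusters.values(), key=len)
    count = len(largest)
    # Representative color: mean of the largest cluster.
    return tuple(round(sum(p[c] for p in largest) / count)
                 for c in range(3))
```

Sampling at regular intervals keeps the cost low on large scans, and smoothing reduces the influence of isolated noisy pixels on the cluster sizes.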

5. The method as claimed in claim 1, further
comprising the step of reducing a size of said document
image, wherein said step of reducing the size of said
document image includes the steps of:

dividing said document image into a plurality of
blocks;

obtaining a representative color of each of
said plurality of blocks;

determining colors of said plurality of blocks
after sizes of said plurality of blocks are reduced, by
comparing said representative color and said background
color; and

reducing said plurality of blocks into the
plurality of pixels having said colors.



6. The method as claimed in claim 5, wherein 
said each of the plurality of blocks is a 3X3 or 4X4 
grating. 
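
One possible reading of the size reduction in claims 5 and 6 is sketched below. It assumes that the representative color of a block is the pixel that differs most from the background color, and that the reduced pixel keeps this color only when the difference exceeds a threshold. Both choices, as well as the names `reduce_image`, `block` and `threshold`, are assumptions made for illustration.

```python
def reduce_image(image, background, block=3, threshold=30):
    """Shrink an RGB image so that each block becomes one pixel.

    image: list of rows of (r, g, b) tuples; background: (r, g, b).
    """
    height, width = len(image), len(image[0])

    def distance(color):
        # Largest per-channel difference from the background color.
        return max(abs(color[c] - background[c]) for c in range(3))

    reduced = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            pixels = [
                image[y][x]
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            ]
            # Assumed representative: the least background-like pixel,
            # so that thin strokes survive the reduction.
            representative = max(pixels, key=distance)
            if distance(representative) > threshold:
                row.append(representative)
            else:
                row.append(background)
        reduced.append(row)
    return reduced
```

With `block=3` or `block=4` this corresponds to the 3X3 or 4X4 blocks named in claim 6.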



7. The method as claimed in claim 1, wherein
said step (c) includes the step of determining a focused 
pixel as a pixel located in an area other than said 
background area if a difference between three primary 
colors of said background color and said focused pixel 
is larger than a fixed value. 
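
The test in claim 7 can be written in a few lines. The claim does not say how the differences of the three primary colors are combined, so this sketch assumes the largest per-channel absolute difference is compared with the fixed value. The name `foreground_mask` and the default `threshold` are hypothetical.

```python
def foreground_mask(image, background, threshold=40):
    """Mark pixels that lie outside the background area.

    image: list of rows of (r, g, b) tuples; background: (r, g, b).
    Returns a 2-D list of booleans, True for non-background pixels.
    """
    return [
        [
            # Assumption: compare the largest R, G or B difference.
            max(abs(pixel[c] - background[c]) for c in range(3)) > threshold
            for pixel in row
        ]
        for row in image
    ]
```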



8. The method as claimed in claim 1, further 
comprising the steps of: 

creating the document image, in which a figure
or photograph rectangular area separated by said step
(e) is painted over with a specified color;

binarizing said document image; and 
recognizing characters on a binary image 
obtained by binarizing said document image. 



9. The method as claimed in claim 1, further 
comprising the step of recursively performing said step 
(e) to a specific rectangular area classified at said 
step (e).



10. A method of recognizing a document image, 
comprising the steps of: 

a) inputting said document image as a digital image;

b) performing color area separation to said
document image;

c) creating a binary image for each area 
separated by said color area separation; 



d) creating a single binary image by combining 
said binary image for each area, thereby performing 
binarization to said document image; 

e) performing binary area separation to said 
single binary image; 

f) comparing a result of said color area 
separation and a result of said binary area separation; 
and 

g) obtaining a binary image and a result of 
area separation by performing a feedback process until a 
certain condition is satisfied, or for a fixed number of times, in
accordance with a result of the step (f).
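
The control flow of steps (b) through (g) might look like the sketch below. Every processing stage is passed in as a callable because the claim does not fix how each stage works. All parameter names, and the convention that `compare` returns an empty collection when the two results agree, are assumptions.

```python
def separate_with_feedback(image, color_separate, binarize, binary_separate,
                           compare, reprocess, max_rounds=3):
    """Run color and binary area separation with a bounded feedback loop.

    Returns the final binary image and the final area-separation result.
    """
    color_result = color_separate(image)        # step (b)
    binary = binarize(image, color_result)      # steps (c) and (d)
    binary_result = binary_separate(binary)     # step (e)
    for _ in range(max_rounds):                 # bounded number of rounds
        disagreements = compare(color_result, binary_result)  # step (f)
        if not disagreements:                   # stopping condition met
            break
        # Step (g): redo the processing for the disagreeing ranges.
        color_result, binary, binary_result = reprocess(
            image, disagreements, color_result, binary, binary_result)
    return binary, binary_result
```

Capping the loop with `max_rounds` guarantees termination even when the two separation results never fully agree.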



11. The method as claimed in claim 10, 
wherein said feedback process is performed in a case in 
which the certain condition is not satisfied in a range 
of said document image as the result of the step (f),
said feedback process including the steps of: 

creating an area that includes said range; 

performing said color area separation, said 
binarization and said binary area separation to said 
area; and 



performing said step (f).



12. The method as claimed in claim 10, 
wherein said feedback process is performed in a case in 
which a text line is extracted from a range in said 
document image by one of said color area separation and 
said binary area separation, and a character rectangle 
including characters is not extracted from said range by 
the other, said feedback process including the steps of: 

specifying a character color of said character
rectangle;

determining that said range includes a
character if said character color is even throughout
said range;

performing said color area separation, said
binarization, and said binary area separation to said range
by use of said character color; and

performing said step (f).

13. The method as claimed in claim 12,
wherein said feedback process includes the steps of: 

creating an area including the text line in a
case in which said text line extracted from the range by
said color area separation does not exist in the result
of the binary area separation as the result of said step
(f);

performing said binarization and said binary
area separation to said area; and

performing said step (f).



14. The method as claimed in claim 10,
wherein said feedback process is performed in a case in
which layout features of a fixed number or more than the
fixed number of text lines are continuously different
between said result of the color area separation and
said result of the binary area separation as the result
of said step (f), said feedback process including the
steps of:

creating an area including said text lines;

binarizing said area;

performing said binary area separation to said
area; and

performing said step (f).



15. The method as claimed in claim 10, 
wherein an image-division-type binarizing method is 
applied to a text area of said document image, and a 
discriminant analysis method is applied to ruled-line, 
figure, and photograph areas. 
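
For the non-text areas, a threshold can be chosen by maximizing the between-class variance of the gray-level histogram. The sketch below assumes that the "discriminant analysis method" of claim 15 refers to this technique, commonly known as Otsu's method, and that the area has already been converted to 8-bit gray values. The function name is hypothetical.

```python
def otsu_threshold(gray_values):
    """Pick a threshold t in 0..255 by maximizing between-class variance.

    gray_values: iterable of integers in 0..255.
    Values <= t form one class, values > t the other.
    """
    histogram = [0] * 256
    total = 0
    for value in gray_values:
        histogram[value] += 1
        total += 1
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(256):
        weight_low += histogram[t]
        sum_low += t * histogram[t]
        if weight_low == 0:
            continue                 # no pixels in the low class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break                    # all pixels are in the low class
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_threshold, best_variance = t, variance
    return best_threshold
```

A binary image for the area then follows from comparing each gray value with the returned threshold, for example `[v > t for v in gray_values]`.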



16. The method as claimed in claim 10, 
wherein said color area separation includes the steps 
of: 

specifying a background color of said document
image;

extracting a plurality of pixels located 
outside a background area from said document image by 
use of said background color; 

creating a plurality of connected elements by 
combining said plurality of pixels; and 

classifying said plurality of connected
elements into a plurality of fixed types of areas by use 
of at least features of shapes of said connected 
elements to obtain an area-separated document image. 



17. A document-image recognition device 
recognizing a document image including a plurality of 
areas, comprising:

an input unit inputting said document image as
a digital image;

a background-color specifying unit specifying 
a background color of said document image; 

an extracting unit extracting a plurality of 
pixels located outside a background area from said 
document image by use of said background color; 

a creating unit creating a plurality of 
connected elements by combining said plurality of 
pixels; and 

a classifying unit classifying said plurality 
of connected elements into a plurality of fixed types of 
areas by use of at least features of shapes of said 
connected elements to obtain an area-separated document
image.



18. The document-image recognition device as 
claimed in claim 17, further comprising: 

a binary-image creating unit creating a binary 
image by binarizing said area-separated document image; 

a correcting unit classifying a plurality of 
areas included in said binary image into said plurality 
of fixed types of areas, and correcting said area- 
separated document image by comparing said area- 
separated document image with a result of classifying 
the plurality of areas; and 

a recognizing unit recognizing a character in 
a text area of said document image. 



19. The document-image recognition device as
claimed in claim 17, wherein said background-color 
specifying unit includes: 

a clustering unit clustering a plurality of 
colors on said document image; and

a setting unit setting a representative color 
of a largest cluster obtained by clustering said 
plurality of colors on said document image to said 
background color. 



20. The document-image recognition device as
claimed in claim 19, wherein said clustering unit
includes:

a sampling unit sampling the plurality of 
pixels at regular intervals; and 

a cluster unit clustering the plurality of 
colors on said document image by use of a plurality of 
pixel values obtained by smoothing pixels surrounding 
said plurality of pixels. 



21. The document-image recognition device as
claimed in claim 17, further comprising a reducing unit
reducing a size of said document image, wherein said
reducing unit includes:

a dividing unit dividing said document image
into a plurality of blocks;

a representative-color obtaining unit 
obtaining a representative color of each of said 
plurality of blocks; 

a color determining unit determining colors of 
said plurality of blocks after sizes of said plurality 
of blocks are reduced, by comparing said representative 
color and said background color; and 

a block reducing unit reducing said plurality 
of blocks into the plurality of pixels having said 
colors. 



22. The document-image recognition device as
claimed in claim 21, wherein each of said plurality of 
blocks is a 3X3 or 4X4 grating. 



23. The document-image recognition device as 
claimed in claim 17, wherein said extracting unit
includes a pixel determining unit determining a focused
pixel as a pixel located outside said background area if
a difference between three primary colors of said
background color and said focused pixel is larger than a
fixed value.

24. The document-image recognition device as
claimed in claim 17, further comprising: 

a document-image creating unit creating the
document image, in which a figure or photograph
rectangular area separated by said classifying unit is
painted over with a specified color;

a binarizing unit binarizing said document
image; and

a character recognizing unit recognizing
characters on a binary image obtained by binarizing said
document image.

25. The document-image recognition device as
claimed in claim 17, recursively carrying out a process 
performed by said classifying unit to a specific 
rectangular area classified by said classifying unit. 



26. A document-image recognition device 
recognizing a document image, comprising: 

an input unit inputting said document image as 
a digital image;

a color area separation unit performing color
area separation to said document image;

a binary-image creating unit creating a binary
image for each area separated by said color area
separation;

a binary area separation unit creating a 
single binary image by combining said binary image for 
each area, thereby performing binarization to said 
document image, and performing binary area separation to 
said single binary image; 

a comparing unit comparing a result of said 
color area separation and a result of said binary area 
separation; and 



an obtaining unit obtaining a binary image and 
a result of area separation by performing a feedback 
process until a certain condition is satisfied, or for a
fixed number of times, in accordance with a result of comparison
carried out by said comparing unit. 



27. The document-image recognition device as 
claimed in claim 26, wherein said feedback process is 
performed in a case in which a text line is extracted 
from a range in said document image by one of said color 
area separation and said binary area separation, and a 
character rectangle including characters is not 
extracted from said range by the other, said feedback 
process including the steps of: 

specifying a character color of said character
rectangle;

determining that said range includes a 
character if said character color is even throughout 
said range; 

performing said color area separation, said
binarization, and said binary area separation to said range
by use of said character color; and

performing said comparison.



28. The document-image recognition device as 
claimed in claim 27, wherein said feedback process 
includes the steps of: 

creating an area including the text line in a 
case in which said text line extracted from the range by 
said color area separation does not exist in the result 
of the binary area separation as the result of said 
comparison; 

performing said binarization and said binary 
area separation to said area; and 

performing said comparison. 



29. The document-image recognition device as 
claimed in claim 26, wherein said feedback process is 
performed in a case in which layout features of a fixed 
number or more than the fixed number of text lines are 
continuously different between said result of the color
area separation and said result of the binary area
separation as the result of said comparison, said 
feedback process including the steps of: 

creating an area including said text lines; 

binarizing said area; 

performing said binary area separation to said
area; and

performing said comparison. 



30. A record medium readable by a computer, 
tangibly embodying a program of instructions executable 
by the computer to carry out a document-image 
recognition process, said instructions comprising the 
steps of: 

a) inputting said document image as a digital image;

b) specifying a background color of said 
document image; 

c) extracting a plurality of pixels located 
outside a background area from said document image by 
use of said background color; 

d) creating a plurality of connected elements 
by combining said plurality of pixels; and

e) classifying said plurality of connected 
elements into a plurality of fixed types of areas by use 
of at least features of shapes of said connected 
elements to obtain an area-separated document image.



31. The record medium as claimed in claim 30,
wherein said instructions further include the steps of:

f) creating a binary image by binarizing said
area-separated document image;

g) classifying a plurality of areas included
in said binary image into said plurality of fixed types
of areas;

h) comparing a result of the step (e) and a
result of the step (g);

i) correcting said area-separated document
image if said result of the step (e) is not equal to
said result of the step (g); and

j) recognizing a character in a text area of
said area-separated document image.

32. A record medium readable by a computer,
tangibly embodying a program of instructions executable 
by the computer to carry out a document-image 
recognition process, said instructions comprising the 
steps of: 

a) inputting said document image as a digital image;

b) performing color area separation to said 
document image; 

c) creating a binary image for each area 
separated by said color area separation; 

d) creating a single binary image by combining 
said binary image for each area, thereby performing 
binarization to said document image; 

e) performing binary area separation to said 
single binary image; 

f) comparing a result of said color area 
separation and a result of said binary area separation; 
and 

g) obtaining a binary image and a result of 
area separation by performing a feedback process until a 
certain condition is satisfied, or for a fixed number of times, in
accordance with a result of the step (f).



